 lead time


Distillation and Interpretability of Ensemble Forecasts of ENSO Phase using Entropic Learning

Groom, Michael, Bassetti, Davide, Horenko, Illia, O'Kane, Terence J.

arXiv.org Machine Learning

This paper introduces a distillation framework for an ensemble of entropy-optimal Sparse Probabilistic Approximation (eSPA) models, trained exclusively on satellite-era observational and reanalysis data to predict ENSO phase up to 24 months in advance. While eSPA ensembles yield state-of-the-art forecast skill, they are harder to interpret than individual eSPA models. We show how to compress the ensemble into a compact set of "distilled" models by aggregating the structure of only those ensemble members that make correct predictions. This process yields a single, diagnostically tractable model for each forecast lead time that preserves forecast performance while also enabling diagnostics that are impractical to implement on the full ensemble. An analysis of the regime persistence of the distilled model "superclusters", as well as cross-lead clustering consistency, shows that the discretised system accurately captures the spatiotemporal dynamics of ENSO. By considering the effective dimension of the feature importance vectors, the complexity of the input space required for correct ENSO phase prediction is shown to peak when forecasts must cross the boreal spring predictability barrier. Spatial importance maps derived from the feature importance vectors are introduced to identify where predictive information resides in each field and are shown to include known physical precursors at certain lead times. Case studies of key events are also presented, showing how fields reconstructed from distilled model centroids trace the evolution from extratropical and inter-basin precursors to the mature ENSO state. Overall, the distillation framework enables a rigorous investigation of long-range ENSO predictability that complements real-time data-driven operational forecasts.





Scaling transformer neural networks for skillful and reliable medium-range weather forecasting

Nguyen, Tung

Neural Information Processing Systems

Recently, data-driven approaches for weather forecasting based on deep learning have shown great promise, achieving accuracies that are competitive with operational systems. However, those methods often employ complex, customized architectures without sufficient ablation analysis, making it difficult to understand what truly contributes to their success.


OceanForecastBench: A Benchmark Dataset for Data-Driven Global Ocean Forecasting

Jia, Haoming, Han, Yi, Wang, Xiang, Wang, Huizan, Wu, Wei, Zheng, Jianming, Xiao, Peikun

arXiv.org Machine Learning

Global ocean forecasting aims to predict key ocean variables such as temperature, salinity, and currents, which is essential for understanding and describing oceanic phenomena. In recent years, data-driven deep learning-based ocean forecast models, such as XiHe, WenHai, LangYa and AI-GOMS, have demonstrated significant potential in capturing complex ocean dynamics and improving forecasting efficiency. Despite these advancements, the absence of open-source, standardized benchmarks has led to inconsistent data usage and evaluation methods. This gap hinders efficient model development, impedes fair performance comparison, and constrains interdisciplinary collaboration. To address this challenge, we propose OceanForecastBench, a benchmark offering three core contributions: (1) high-quality global ocean reanalysis data spanning 28 years for model training, including 4 ocean variables across 23 depth levels and 4 sea surface variables; (2) high-reliability satellite and in-situ observations for model evaluation, covering approximately 100 million locations in the global ocean; and (3) an evaluation pipeline and a comprehensive benchmark with 6 typical baseline models, leveraging observations to evaluate model performance from multiple perspectives. OceanForecastBench represents the most comprehensive benchmarking framework currently available for data-driven ocean forecasting, offering an open-source platform for model development, evaluation, and comparison. The dataset and code are publicly available at: https://github.com/Ocean-Intelligent-Forecasting/OceanForecastBench.


A Diffusion-Based Framework for High-Resolution Precipitation Forecasting over CONUS

Vicens-Miquel, Marina, McGovern, Amy, Hill, Aaron J., Foufoula-Georgiou, Efi, Guilloteau, Clement, Shen, Samuel S. P.

arXiv.org Artificial Intelligence

Accurate precipitation forecasting is essential for hydrometeorological risk management, especially for anticipating extreme rainfall that can lead to flash flooding and infrastructure damage. This study introduces a diffusion-based deep learning (DL) framework that systematically compares three residual prediction strategies differing only in their input sources: (1) a fully data-driven model using only past observations from the Multi-Radar Multi-Sensor (MRMS) system, (2) a corrective model using only forecasts from the High-Resolution Rapid Refresh (HRRR) numerical weather prediction system, and (3) a hybrid model integrating both MRMS and selected HRRR forecast variables. By evaluating these approaches under a unified setup, we provide a clearer understanding of how each data source contributes to predictive skill over the Continental United States (CONUS). Forecasts are produced at 1-km spatial resolution, beginning with direct 1-hour predictions and extending to 12 hours using autoregressive rollouts. Performance is evaluated using both CONUS-wide and region-specific metrics that assess overall performance and skill at extreme rainfall thresholds. Across all lead times, our DL framework consistently outperforms the HRRR baseline in pixel-wise and spatiostatistical metrics. The hybrid model performs best at the shortest lead time, while the HRRR-corrective model outperforms others at longer lead times, maintaining high skill through 12 hours. To assess reliability, we incorporate calibrated uncertainty quantification tailored to the residual learning setup. These gains, particularly at longer lead times, are critical for emergency preparedness, where modest increases in forecast horizon can improve decision-making. This work advances DL-based precipitation forecasting by enhancing predictive skill, reliability, and applicability across regions.
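The autoregressive rollout mentioned in this abstract can be illustrated with a minimal sketch. It assumes only that a one-step model maps the current field to the field one hour later; `rollout` and `toy_step` are hypothetical names, and the toy step is a stand-in for a trained network, not the authors' diffusion model or residual setup.

```python
def rollout(step_model, initial_state, n_steps):
    """Apply a one-step model repeatedly, feeding each output back as input."""
    states = []
    state = initial_state
    for _ in range(n_steps):
        state = step_model(state)  # prediction for the next lead time
        states.append(state)
    return states


def toy_step(field):
    # Hypothetical stand-in for a learned 1-hour predictor:
    # every value simply decays by 10% per step.
    return [0.9 * v for v in field]


# Twelve chained 1-hour steps give forecasts for lead times 1 h to 12 h.
forecasts = rollout(toy_step, [10.0, 20.0], n_steps=12)
```

Because each step consumes the previous prediction rather than an observation, errors can accumulate with lead time, which is one reason skill at longer leads is reported separately.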


FuXi-Nowcast: Meet the longstanding challenge of convective initiation in nowcasting

Chen, Lei, Zhu, Zijian, Zhuang, Xiaoran, Qi, Tianyuan, Feng, Yuxuan, Zhong, Xiaohui, Li, Hao

arXiv.org Artificial Intelligence

Accurate nowcasting of convective storms remains a major challenge for operational forecasting, particularly for convective initiation and the evolution of high-impact rainfall and strong winds. Here we present FuXi-Nowcast, a deep-learning system that jointly predicts composite radar reflectivity, surface precipitation, near-surface temperature, wind speed and wind gusts at 1-km resolution over eastern China. FuXi-Nowcast integrates multi-source observations, such as radar, surface stations and the High-Resolution Land Data Assimilation System (HRLDAS), with three-dimensional atmospheric fields from the machine-learning weather model FuXi-2.0 within a multi-task Swin-Transformer architecture. A convective signal enhancement module and distribution-aware hybrid loss functions are designed to preserve intense convective structures and mitigate the rapid intensity decay common in deep-learning nowcasts. FuXi-Nowcast surpasses the operational CMA-MESO 3-km numerical model in Critical Success Index for reflectivity, precipitation and wind gusts across thresholds and lead times up to 12 h, with the largest gains for heavy rainfall. Case studies further show that FuXi-Nowcast more accurately captures the timing, location and structure of convective initiation and subsequent evolution of convection. These results demonstrate that coupling three-dimensional machine-learning forecasts with high-resolution observations can provide multi-hazard, long-lead nowcasts that outperform current operational systems.
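For readers unfamiliar with the Critical Success Index (CSI) used in this comparison, the sketch below shows the standard definition, hits / (hits + misses + false alarms), evaluated at a single threshold. The function name, the sample values, and the threshold are illustrative assumptions and are not taken from the paper.

```python
def critical_success_index(forecast, observed, threshold):
    """CSI = hits / (hits + misses + false alarms) for one event threshold."""
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        forecast_event = f >= threshold
        observed_event = o >= threshold
        if forecast_event and observed_event:
            hits += 1
        elif observed_event:
            misses += 1
        elif forecast_event:
            false_alarms += 1
        # Correct negatives are ignored by CSI.
    denominator = hits + misses + false_alarms
    # CSI is undefined when the event is neither forecast nor observed.
    return hits / denominator if denominator else float("nan")


# Made-up values at five grid points, scored at a hypothetical threshold of 20.
forecast = [0.0, 12.0, 30.0, 5.0, 40.0]
observed = [0.0, 25.0, 35.0, 22.0, 10.0]
csi_20 = critical_success_index(forecast, observed, threshold=20.0)
```

Because correct negatives do not enter the score, CSI stays informative for rare events such as heavy rainfall, where most grid points see no event.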


FlowCast: Advancing Precipitation Nowcasting with Conditional Flow Matching

Ribeiro, Bernardo Perrone, Pucer, Jana Faganeli

arXiv.org Artificial Intelligence

Radar-based precipitation nowcasting, the task of forecasting short-term precipitation fields from previous radar images, is a critical problem for flood risk management and decision-making. While deep learning has substantially advanced this field, two challenges remain fundamental: the uncertainty of atmospheric dynamics and the efficient modeling of high-dimensional data. Diffusion models have shown strong promise by producing sharp, reliable forecasts, but their iterative sampling process is computationally prohibitive for time-critical applications. We introduce FlowCast, the first end-to-end probabilistic model leveraging Conditional Flow Matching (CFM) as a direct noise-to-data generative framework for precipitation nowcasting. Unlike hybrid approaches, FlowCast learns a direct noise-to-data mapping in a compressed latent space, enabling rapid, high-fidelity sample generation. Our experiments demonstrate that FlowCast establishes a new state of the art in probabilistic performance while also exceeding deterministic baselines in predictive accuracy. A direct comparison further reveals that the CFM objective is both more accurate and significantly more efficient than a diffusion objective on the same architecture, maintaining high performance with far fewer sampling steps. This work positions CFM as a powerful and practical alternative for high-dimensional spatiotemporal forecasting.
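As background on the CFM objective, the sketch below builds one training pair under a commonly used formulation: a straight-line path from a Gaussian noise sample x0 to a data sample x1, whose target velocity is x1 - x0. This is an assumption for illustration; the abstract does not state which probability path FlowCast uses, and the conditioning on past radar frames and the latent-space compression are omitted. All names here are hypothetical.

```python
import random


def cfm_training_pair(x1, t, rng):
    """Build (x0, x_t, target) for one data sample x1 at time t in [0, 1]."""
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]                 # noise sample
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]  # point on the line
    target = [b - a for a, b in zip(x0, x1)]               # constant velocity
    return x0, x_t, target


def cfm_loss(predicted_velocity, target_velocity):
    """Mean squared error between predicted and target velocities."""
    n = len(target_velocity)
    return sum((p - u) ** 2 for p, u in zip(predicted_velocity, target_velocity)) / n


rng = random.Random(0)
data_sample = [0.5, -1.0, 2.0]  # made-up stand-in for a latent vector
x0, x_t, target = cfm_training_pair(data_sample, t=0.3, rng=rng)
# A network v(x_t, t) would be trained to minimise cfm_loss(v(x_t, t), target).
```

With a straight path the target velocity does not depend on t, which is part of why samplers trained this way can often integrate the learned flow in few steps.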


XiChen: An observation-scalable fully AI-driven global weather forecasting system with 4D variational knowledge

Wang, Wuxin, Ni, Weicheng, Huang, Lilan, Hao, Tao, Fei, Ben, Ma, Shuo, Yuan, Taikang, Zhao, Yanlai, Deng, Kefeng, Li, Xiaoyong, Leng, Hongze, Duan, Boheng, Bai, Lei, Zhang, Weimin, Ren, Kaijun, Song, Junqiang

arXiv.org Artificial Intelligence

Artificial intelligence (AI)-driven models have the potential to revolutionize weather forecasting, but still rely on initial conditions generated by costly Numerical Weather Prediction (NWP) systems. Although recent end-to-end forecasting models attempt to bypass NWP systems, these methods lack scalable assimilation of new types of observational data. Here, we introduce XiChen, an observation-scalable fully AI-driven global weather forecasting system, wherein the entire pipeline, from Data Assimilation (DA) to medium-range forecasting, can be accomplished within only 15 seconds. XiChen is built upon a foundation model that is pre-trained for weather forecasting and subsequently fine-tuned to serve as both observation operators and DA models, thereby enabling the scalable assimilation of conventional and raw satellite observations. Furthermore, the integration of Four-Dimensional Variational (4DVar) knowledge enables XiChen to achieve DA and medium-range forecasting accuracy comparable to operational NWP systems, with skillful forecasting lead time beyond 8.75 days. A key feature of XiChen is its ability to maintain physical balance constraints during DA, enabling observed variables to correct unobserved ones effectively. In single-point perturbation DA experiments, XiChen exhibits flow-dependent characteristics similar to those of traditional 4DVar systems. These results demonstrate that XiChen holds strong potential for fully AI-driven weather forecasting independent of NWP systems.